Research finds AI users scarily willing to "surrender" their cognition to LLMs

When it comes to large language model-powered tools, there are generally two broad categories of users. On one side are those who treat AI as a powerful but sometimes faulty service that needs careful human oversight and review to detect reasoning or factual flaws in responses. On the other side are those who routinely outsource their critical thinking to what they see as an all-knowing machine.

Recent research goes a long way to forming a new psychological framework for that second group, which regularly engages in "cognitive surrender" to AI's seemingly authoritative answers. That research also provides some experimental examination of when and why people are willing to outsource their critical thinking to AI, and how factors like time pressure and external incentives can affect that decision.

Just ask the answer machine

In "Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender," researchers from the University of Pennsylvania sought to build on existing scholarship that outlines two broad categories of decision-making: one shaped by "fast, intuitive, and affective processing" (System 1); and one shaped by "slow, deliberative, and analytical reasoning" (System 2). The onset of AI systems, the researchers argue, has created a new, third category of "artificial cognition" in which decisions are driven by "external, automated, data-driven reasoning originating from algorithmic systems rather than the human mind."

In the past, people have often used tools from calculators to GPS systems for a kind of task-specific "cognitive offloading," strategically delegating some jobs to reliable automated algorithms while using their own internal reasoning to oversee and evaluate the results. But the researchers argue that AI systems have given rise to a categorically different form of "cognitive surrender" in which users provide "minimal internal engagement" and accept an AI's reasoning wholesale without oversight or verification. This "uncritical abdication of reasoning itself" is particularly common when an LLM's output is "delivered fluently, confidently, or with minimal friction," they point out.

To measure the prevalence and effect of this kind of cognitive surrender to AI, the researchers performed a number of studies based on Cognitive Reflection Tests (CRTs). These tests are designed to elicit incorrect answers from participants who default to "intuitive" (System 1) thought processes, but to be relatively simple to answer for those who use more "deliberative" (System 2) thought processes.

Test subjects who consulted AI were overwhelmingly willing to accept its answers without scrutiny, whether correct or not. Credit: Shaw and Nave

For their experiments, the researchers provided participants with optional access to an LLM chatbot that had been modified to randomly provide inaccurate answers to the CRT questions about half the time (and accurate answers the other half). The researchers hypothesized that users who frequently consulted the chatbot would let those incorrect answers "override intuitive and deliberative processes," hurting their overall performance and highlighting the dangers of cognitive surrender.

In one study, an experimental group with access to this modified AI consulted it for help with about 50 percent of the presented CRT problems. When the AI was accurate, those AI users accepted its reasoning about 93 percent of the time. When the AI was randomly "faulty," though, those users still accepted the AI reasoning a lower (but still high) 80 percent of the time, showing that the mere presence of the AI frequently "displaced internal reasoning," according to the researchers.

Unsurprisingly, the AI-using experimental group did much better than the "brain-only" control group when the AI provided accurate answers, and much worse than the control when the AI was inaccurate. Significantly, though, the group that used AI scored 11.7 percent higher on a measure of their own confidence in their answers, even though the LLM provided wrong answers half the time.

In another study, adding incentives (in the form of small payments) and immediate feedback for correct answers increased the likelihood that participants successfully overruled the faulty AI by 19 percentage points relative to the baseline, showing that salient consequences can encourage AI users to spend extra time verifying responses. But adding time pressures in the form of a 30-second timer decreased that tendency to correct the faulty AI by 12 percentage points, suggesting to the researchers that "when decision time is scarce, the internal monitor detecting conflict and recruiting deliberation is less likely to trigger."

"Lowering the threshold for scrutiny"

Overall, across 1,372 participants and over 9,500 individual trials, the researchers found subjects were willing to accept faulty AI reasoning a whopping 73.2 percent of the time, while only overruling it 19.7 percent of the time. The researchers say this "demonstrate[s] that people readily incorporate AI-generated outputs into their decision-making processes, often with minimal friction or skepticism." In general, "fluent, confident outputs [are treated] as epistemically authoritative, lowering the threshold for scrutiny and attenuating the meta-cognitive signals that would ordinarily route a response to deliberation," they write.

Subjects with high trust in AI were more likely to be misled by faulty responses, while those with high "Fluid IQ" were less likely to be misled by the AI. Credit: Shaw and Nave

These kinds of effects weren't uniform across all test subjects, though. Those who scored highly on separate measures of so-called fluid IQ were less likely to rely on the AI for help and were more likely to overrule a faulty AI when it was consulted. Those predisposed to see AI as authoritative in a survey, on the other hand, were much more likely to be led astray by faulty AI-provided answers.

Despite the results, though, the researchers point out that "cognitive surrender is not inherently irrational." While relying on an LLM that's wrong half the time (as in these experiments) has obvious downsides, a "statistically superior system" could plausibly give better-than-human results in domains such as "probabilistic settings, risk assessment, or extensive data," the researchers suggest.

"As reliance increases, performance tracks AI quality," the researchers write, "rising when accurate and falling when faulty, illustrating the promises of superintelligence and exposing a structural vulnerability of cognitive surrender."

In other words, letting an AI do your reasoning means your reasoning is only ever going to be as good as that AI system. As always, let the prompter beware.

How can I use Read­Directory­ChangesW to know when someone is copying a file out of the directory?

A customer was using Read­Directory­ChangesW in the hopes of receiving a notification when a file was copied. They found that when a file was copied, they received a FILE_NOTIFY_CHANGE_LAST_ACCESS, but only once an hour. And they also got that notification even for operations unrelated to file copying.

Recall that Read­Directory­ChangesW and Find­First­Change­Notification are for detecting changes to information that would appear in a directory listing. Your program can perform a Find­First­File/Find­Next­File to cache a directory listing, and then use Read­Directory­ChangesW or Find­First­Change­Notification to be notified that the directory listing has changed, and you have to invalidate your cache.

But there are a lot of operations that don’t affect a directory listing.

For example, a program could open a file in the directory with last access time updates suppressed. (Or the volume might have last access time updates suppressed globally.) There is no change to the directory listing, so no event is signaled.

Functions like Read­Directory­ChangesW and Find­First­Change­Notification operate at the file system level, so the fundamental operations they see are things like “read” and “write”. They don’t know why somebody is reading or writing. All they know is that it’s happening.

If you are a video rental store, you can see that somebody rented a documentary about pigs. But you don’t know why they rented that movie. Maybe they’re doing a school report. Maybe they’re trying to make illegal copies of pig movies. Or maybe they simply like pigs.

If you are the file system, you see that somebody opened a file for reading and read the entire contents. Maybe they are loading the file into Notepad so they can edit it. Or maybe they are copying the file. You don’t know. Related: If you let people read a file, then they can copy it.

In theory, you could check, when a file is closed, whether all the write operations collectively combine to form file contents that match a collective set of read operations from another file. Or you could hash the file to see if it matches the hash of any other file.¹ But these extra steps would get expensive very quickly.

Indeed, we found during user research that a common way for users to copy files is to load them into an application, and then use Save As to save a copy somewhere else. In many cases, this “copy” is not byte-for-byte identical to the original, although it is functionally identical. (For example, it might have a different value for Total editing time.) Therefore, detecting copying by comparing file hashes is not always successful.²
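To make the limits of hash comparison concrete, here is a minimal sketch in portable C. It is not how any Windows component detects duplicates; the hash (FNV-1a) and the helper names `fnv1a64` and `looks_like_copy` are assumptions chosen for illustration. It shows that two empty inputs always "match", while a resaved document that differs in a single metadata byte does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (64-bit): a simple non-cryptographic hash, used here only
   to illustrate content comparison. Any hash would behave the same
   way for the purposes of this example. */
uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hypothetical "is this a copy?" test based purely on contents.
   Returns nonzero when the two buffers have the same length and hash. */
int looks_like_copy(const void *a, size_t a_len,
                    const void *b, size_t b_len)
{
    return a_len == b_len && fnv1a64(a, a_len) == fnv1a64(b, b_len);
}
```

Calling `looks_like_copy("", 0, "", 0)` reports a match even though neither empty file was copied from the other, and changing one byte of a document before saving it is enough to make the check fail.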

If your goal is to detect files being “copied” (however you choose to define it), you’ll have to operate at another level. For example, you could use various data classification technologies to attach security labels to files and let the data classification software do the work of preventing files from crossing security levels. These technologies usually work best in conjunction with programs that have been updated to understand and enforce these data classification labels. (My guess is that they also use heuristics to detect and classify usage by legacy programs.)

¹ It would also generate false positives for files that are identical merely by coincidence. For example, every empty file would be flagged as a copy of every other empty file.

Windows 2000 Server had a feature called Single Instance Store which looked for identical files, but it operated only when the system was idle. It didn’t run during the copy operation. This feature was subsequently deprecated in favor of Data Deduplication, which looks for identical files as well as identical blocks of files. Again, Data Deduplication runs during system idle time. It doesn’t run during the copy operation. The duplicate is detected only after the fact. (Note the terminology: It is a “duplicate” file, not a “copy”. Two files could be identical without one being a copy of the other.)

² And besides, even if the load-and-save method produces byte-for-byte identical files, somebody who wanted to avoid detection would just make a meaningless change to the document before saving it.

The post How can I use Read­Directory­ChangesW to know when someone is copying a file out of the directory? appeared first on The Old New Thing.

EPA Flags Microplastics, Pharmaceuticals As Contaminants In Drinking Water

An anonymous reader quotes a report from NPR: Responding to public health concerns about microplastics and pharmaceuticals in the nation's drinking water, the Trump administration for the first time has placed them on a draft list of contaminants maintained by the Environmental Protection Agency. The EPA announced the move Thursday, touting it as a "historic step" for the Make America Healthy Again, or MAHA, movement, which often raises concerns about toxic chemicals and plastic pollution in our food and environment. Also Thursday, the Department of Health and Human Services announced a $144 million initiative, called STOMP, to develop tools to measure and monitor microplastics in drinking water and in a later stage, to remove them. The Safe Drinking Water Act requires the EPA to publish an updated version of its Contaminant Candidate List every five years. This is the sixth iteration of the list. Microplastics and pharmaceuticals appear in the draft of the upcoming list, alongside per- and polyfluoroalkyl substances, or PFAS, and dozens of other chemicals and microbes. Their inclusion on the list gives local regulators a tool to evaluate risks in their water supply, the EPA says, and it can set the stage for more research and regulatory action -- but doesn't actually guarantee that will happen.

Read more of this story at Slashdot.

Mount Everest Climbers 'Poisoned' By Guides In Insurance Fraud Scheme

schwit1 shares a report from the Kathmandu Post: In Nepal, helicopter rescue on high altitude is, by any measure, a genuine lifesaving operation. At high altitude, where oxygen thins and weather changes without warning, the ability to airlift a stricken trekker to Kathmandu within hours has saved countless lives. But threaded through that legitimate system, exploiting its urgency, its opacity, and its distance from oversight, is one of the most sophisticated insurance fraud networks in the world. Nepal's fake rescue scam is not new. The Kathmandu Post first exposed it in 2018. Months later, the government convened a fact-finding committee, produced a 700-page report, and announced reforms. In February 2019, The Kathmandu Post published a long investigative report. Last year, Nepal Police's Central Investigation Bureau reopened the file, and what they found is that the fraud did not stop -- instead it was growing. The mechanics of the fake rescue racket are straightforward: stage a medical emergency, call in a helicopter, check a tourist into a hospital, and file an insurance claim that bears little resemblance to what actually happened. But the sophistication lies in how each link in the chain is compensated, and how difficult it is for a foreign insurer -- operating from Australia and the United Kingdom -- to verify events that occurred at 3,000 metres in a remote Himalayan valley. The CIB investigation identifies two primary methods for manufacturing an "emergency." The first involves tourists who simply don't want to walk back. After completing a demanding trek -- an Everest Base Camp trek, for instance, can take up to two weeks on foot -- guides offer an alternative: pretend to be sick, and a helicopter will come. The guide handles the rest. The second method is more troubling. At altitudes above 3,000 meters, mild symptoms of altitude sickness are common. Blood oxygen saturation can drop, hands and feet tingle, headaches develop. 
In most cases, rest, hydration or a gradual descent is all that is needed. But guides and hotel staff, according to the CIB investigation, have been trained to terrify trekkers at precisely this moment. They tell them they are at risk of dying, that only immediate evacuation will save them. In some cases, investigators found that Diamox (Acetazolamide) tablets, used to prevent altitude sickness, were administered alongside excessive water intake to induce the very symptoms that would justify a rescue call. In at least one case cited in the investigation, baking powder was mixed into food to make tourists physically unwell. Once a "rescue" is called, the financial choreography begins. A single helicopter carries multiple passengers. But separate, full-price invoices are submitted to each passenger's insurance company, as if each had their own dedicated flight. A $4,000 charter becomes a $12,000 claim. Fake flight manifests and load sheets are fabricated. At the hospital, medical officers prepare discharge summaries using the digital signatures of senior doctors who were never involved in the case. In some cases, these are done without those doctors' knowledge. Fake admission records are created for tourists who were, in some documented instances, drinking beer in the hospital cafeteria at the time they were supposedly receiving treatment. In one case, an office assistant at Shreedhi Hospital admitted that he had provided his own X-ray report taken about a year ago at a different hospital, to be used as a case for treatment of foreign trekkers to claim insurance. The commission structure that holds the network together was described in detail during police interrogations. Hospitals pay 20 to 25 percent of the insurance payment to trekking companies and a further 20 to 25 percent to helicopter rescue operators in exchange for patient referrals. Trekking guides and their companies benefit from inflated invoices. 
In some cases, tourists themselves are offered cash incentives to participate.

Read more of this story at Slashdot.

Why doesn’t the system let you declare your own messages to have the same semantics as WM_COPY­DATA?

In a comment on my discussion on how to return results back from the WM_COPY­DATA message, Jan Ringoš observed that it felt wasteful that there was this entire infrastructure for copying blocks of memory via a window message, yet only one message uses it! “I always thought something like EnableWindowMessageDataCopy (HWND, UINT, .) after RegisterWindowMessage and ChangeWindowMessageFilterEx to get application’s own private WM_COPYDATA would be a little more secure and convenient, should the programmer didn’t wish to bother with creating shared memory.”

The infrastructure for copying blocks of memory via a window message is used by far more than just one message! The WM_SET­TEXT and WM_GET­TEXT messages use it for passing string buffers, the WM_HELP message uses it for passing the HELPINFO structure, the WM_MDICREATE message uses it for passing the MDICREATESTRUCT structure, and plenty more where those came from. The infrastructure for copying blocks of memory had already existed; it wasn’t created just for the WM_COPY­DATA message. Adding WM_COPY­DATA support was just adding a few lines of code to the common function whose job is to prepare messages to be sent between processes (including copying memory between processes).

Suppose there were a way for a program to declare that one of its custom messages should have (say) its lParam be a pointer to data and its wParam be the size of the data. That could be misleading because the system would copy only the raw bytes of the memory block, not whatever those bytes refer to. For example, if the structure contained pointers, the pointers would just be copied as raw values, rather than adding the pointed-to data to the memory block and adjusting the pointers to point to the copy. It also doesn’t handle the case of sending the message between programs with different pointer or handle sizes, say between a 32-bit program and a 64-bit program.¹ If you need to copy data structures that consist of anything more than scalars (or aggregates of scalars), you’ll have to do your own marshaling to convert your source data structure into a transfer buffer. In practice, this means that sending the message directly with an as-is buffer is unlikely to be the common case; some type of conversion would have to be made anyway.
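As a rough illustration of what "do your own marshaling" can look like, here is a minimal sketch in portable C. The `Record` and `RecordHeader` types and both function names are hypothetical, invented for this example. The idea is to replace the pointer with a length and append the pointed-to bytes, using fixed-width fields so that the layout does not depend on the sender's pointer size. The resulting buffer is the sort of thing you could then describe with lpData and cbData.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* In-memory form: contains a pointer, which is meaningless in
   another process's address space. (Hypothetical example type.) */
typedef struct {
    int32_t     id;
    const char *name;
} Record;

/* Transfer form: fixed-width fields, followed by the name bytes. */
typedef struct {
    int32_t  id;
    uint32_t name_len;   /* number of name bytes, no terminator */
} RecordHeader;

/* Flattens *r into buf. Returns the number of bytes required;
   writes nothing if buf is NULL or cap is too small. */
size_t record_marshal(const Record *r, unsigned char *buf, size_t cap)
{
    assert(r != NULL && r->name != NULL);
    RecordHeader h = { r->id, (uint32_t)strlen(r->name) };
    size_t need = sizeof h + h.name_len;
    if (buf != NULL && cap >= need) {
        memcpy(buf, &h, sizeof h);
        memcpy(buf + sizeof h, r->name, h.name_len);
    }
    return need;
}

/* Rebuilds the fields on the receiving side, validating every
   length before trusting it. Returns 1 on success, 0 on bad input. */
int record_unmarshal(const unsigned char *buf, size_t len,
                     int32_t *id, char *name, size_t name_cap)
{
    RecordHeader h;
    assert(buf != NULL && id != NULL && name != NULL);
    if (len < sizeof h) return 0;
    memcpy(&h, buf, sizeof h);
    if (h.name_len > len - sizeof h) return 0;              /* truncated */
    if (name_cap == 0 || h.name_len > name_cap - 1) return 0; /* too long */
    memcpy(name, buf + sizeof h, h.name_len);
    name[h.name_len] = '\0';
    *id = h.id;
    return 1;
}
```

The receiver never dereferences anything supplied by the sender; it only copies bytes out of the buffer it was handed, after checking that the claimed lengths fit.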

Furthermore, the WM_COPY­DATA message already knew that you wanted to do this, because it left room for it in the COPY­DATA­STRUCT:

typedef struct tagCOPYDATASTRUCT {
  ULONG_PTR dwData; // ← here
  DWORD     cbData;
  PVOID     lpData;
} COPYDATASTRUCT, *PCOPYDATASTRUCT;

In addition to describing the memory buffer, there is this extra guy called dwData. You can put your “message number” in there, allowing you to multiplex multiple “messages” into a single WM_COPY­DATA message.²
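One way the receiving side might dispatch on dwData is sketched below. The message numbers, the `AppState` type, and the `OnCopyData` helper are all hypothetical. The `#else` branch holds stand-in declarations so the sketch also compiles off Windows; on Windows the real definitions come from `<windows.h>`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins for non-Windows builds only; not the real headers. */
typedef uintptr_t ULONG_PTR;
typedef uint32_t  DWORD;
typedef void     *PVOID;
typedef struct tagCOPYDATASTRUCT {
  ULONG_PTR dwData;
  DWORD     cbData;
  PVOID     lpData;
} COPYDATASTRUCT, *PCOPYDATASTRUCT;
#endif

/* Hypothetical application-defined "message numbers" for dwData. */
enum { MYMSG_SET_TITLE = 1, MYMSG_SET_VOLUME = 2 };

/* Hypothetical receiver state. */
typedef struct {
    char    title[64];
    int32_t volume;
} AppState;

/* Intended to be called from the window procedure's WM_COPYDATA
   case with (COPYDATASTRUCT *)lParam. Returns 1 if the payload was
   accepted, 0 if the "message" is unknown or the size is wrong. */
int OnCopyData(AppState *state, const COPYDATASTRUCT *cds)
{
    assert(state != NULL && cds != NULL);
    switch (cds->dwData) {
    case MYMSG_SET_TITLE:
        /* Payload: title characters without a terminator. */
        if (cds->lpData == NULL || cds->cbData == 0 ||
            cds->cbData >= sizeof state->title) return 0;
        memcpy(state->title, cds->lpData, cds->cbData);
        state->title[cds->cbData] = '\0';
        return 1;
    case MYMSG_SET_VOLUME: {
        /* Payload: exactly one fixed-width 32-bit integer. */
        int32_t v;
        if (cds->lpData == NULL || cds->cbData != sizeof v) return 0;
        memcpy(&v, cds->lpData, sizeof v);
        state->volume = v;
        return 1;
    }
    default:
        return 0;   /* not one of our "messages" */
    }
}
```

Checking cbData against what each "message" expects matters because any process that is allowed to send the window a message can put arbitrary values in the structure.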

You don’t need Enable­Window­Message­Data­Copy because you already have it at home. The window manager is more concerned with enabling things that weren’t possible before, rather than making it easier to do things that are already possible. For that, you can use a helper library.

Bonus chatter: In addition to adding complexity to the window manager implementation, allowing programs to customize how messages are marshaled between processes would also make it harder to explain how inter-process marshaling works. Instead of the simple rule “The system marshals messages in the system range, but not messages in the user-defined range,” it would be a much more ambiguous rule: “The system marshals messages in the system range, but not messages in the user-defined range, unless those messages have been customized by a call to Enable­Window­Message­Data­Copy, in which case they marshal by this alternate set of rules.” So now when you look at a message, you can’t tell how it marshals. You’d have to go back to the documentation for the message and hope the person who wrote the documentation remembered to go back and add a section to each page to say whether it follows custom marshaling.

¹ Or between a 16-bit program and a 32-bit program, which was the more common case back in the days when WM_COPY­DATA was designed. In 16-bit code, an int is a 16-bit integer, whereas it’s a 32-bit value in 32-bit code.

² If the dwData was intended to be a message number, why is it pointer-sized? For the same reason timer IDs and dialog control IDs are 64-bit values: “Pointers are like weeds. Anywhere it’s possible to fit a pointer, a pointer will try to squeeze in there.” In this case, people were putting handles (which are pointer-sized) in the dwData, so we had to make it big enough to hold a handle.

The post Why doesn’t the system let you declare your own messages to have the same semantics as WM_COPY­DATA? appeared first on The Old New Thing.

Rapid Snow Melt-Off In American West Stuns Scientists

Scientists say extreme March heat caused an unusually rapid collapse of snowpack across the American West that's leaving major basins at record or near-record lows. "This year is on a whole other level," said Dr Russ Schumacher, a Colorado State University climatologist. "Seeing this year so far below any of the other years we have data for is very concerning." The Guardian reports: [...] The issue is extremely widespread. Data from a branch of the US Department of Agriculture (USDA), which logs averages based on levels between 1991 and 2020, shows states across the south-west and intermountain west with eye-popping lows. The Great Basin had only 16% of average on Monday and the lower Colorado region, which includes most of Arizona and parts of Nevada, was at 10%. The Rio Grande, which covers parts of New Mexico, Texas and Colorado, was at 8%. "This year has the potential of being way worse than any of the years we have analogues for in the past," Schumacher said. Even with near-normal precipitation across most of the west, every major river basin across the region was grappling with snow drought when March began, according to federal analysts. Roughly 91% of stations reported below-median snow water equivalent, according to the last federal snow drought update compiled on March 8. Water managers and climate experts had been hopeful for a March miracle -- a strong cold storm that could set the region on the right track. Instead, a blistering heatwave unlike any recorded for this time of year baked the region and spurred a rapid melt-off. "March is often a big month for snowstorms," Schumacher said. "Instead of getting snow we would normally expect we got this unprecedented, way-off-the-scale warmth." More than 1,500 monthly high temperature records were broken in March and hundreds more tied. 
The event was "likely among the most statistically anomalous extreme heat events ever observed in the American south-west," climate scientist Daniel Swain said in an analysis posted this week. "Beyond the conspicuous 'weirdness' of it all," Swain added, "the most consequential impact of our record-shattering March heat will likely be the decimation of the water year 2025-26 snowpack across nearly all of the American west." Calling the toll left by the heat "nothing short of shocking," Swain noted that California was tied for its worst mountain snowpack value on record. While the highest elevations are still coated in white, "lower slopes are now completely bare nearly statewide."

Read more of this story at Slashdot.
